Text to speech in new languages without a standardized orthography

Authors

  • Sunayana Sitaram
  • Gopala Krishna Anumanchipalli
  • Justin Chiu
  • Alok Parlikar
  • Alan W. Black
Abstract

Many spoken languages do not have a standardized writing system. Building text-to-speech voices for them without accurate transcripts of the speech data is difficult. Our language-independent method for bootstrapping synthetic voices using only speech data relies upon cross-lingual phonetic decoding of speech. In this paper, we describe novel additions to our bootstrapping method. We present results on eight languages from different language families (English, Dari, Pashto, Iraqi, Thai, Konkani, Inupiaq and Ojibwe) and show that our phonetic voices can be made understandable with as little as an hour of speech data that never had transcriptions, and without many resources available in the target language. We also present purely acoustic techniques that can help induce syllable- and word-level information, which can further improve the intelligibility of these voices.


Related resources

Grapheme-to-Phoneme Conversion for Amharic Text-to-Speech System

Developing a correct Grapheme-to-Phoneme (GTP) conversion method is a central problem in text-to-speech synthesis. In particular, deriving phonological features that are not shown in the orthography is challenging. In the Amharic language, geminates and epenthetic vowels are crucial for proper pronunciation, but neither is shown in the orthography. This paper describes an architecture, a preprocessing...


Text-To-Speech for Languages without an Orthography

Speech synthesis models are typically built from a corpus of speech that has accurate transcriptions. However, many of the languages of the world do not have a standardized writing system. This paper is an initial attempt at building synthetic voices for such languages. It may seem useless to develop a text-to-speech system when there is no text available. But we will discuss some well defined ...


Pronunciation Modeling F...

Natural and intelligible Text to Speech (TTS) systems exist for a number of languages in the world today. However, there are many languages of the world for which building TTS systems is still prohibitive, due to the lack of linguistic resources and data. Some of these languages are spoken by a large population of the world. Others are primarily spoken languages, or languages with large non-li...


The First Parallel Multilingual Corpus of Persian: Toward a Persian BLARK

In this article, we have introduced the first parallel corpus of Persian with more than 10 other European languages. This article describes primary steps toward preparing a Basic Language Resources Kit (BLARK) for Persian. Up to now, we have proposed morphosyntactic specification of Persian based on EAGLE/MULTEXT guidelines and specific resources of MULTEXT-East. The article introduces Persian ...


DNN-based Speech Synthesis for Indian Languages from ASCII text

Text-to-Speech synthesis in Indian languages has seen a lot of progress over the decade, partly due to the annual Blizzard challenges. These systems assume the text to be written in Devanagari or Dravidian scripts, which are nearly phonemic orthography scripts. However, the most common form of computer interaction among Indians is ASCII written transliterated text. Such text is generally noisy wi...




Publication date: 2013